모수

parameters, 모든 확률분포의 모양을 결정짓는 결정적인 수.

예를 들어 정규분포의 경우 평균, 분산 2개의 모수가 존재

대부분의 경우 모수를 정확히 알 수 없고, 이를 추측하는 과정을 추정(Estimation)이라고 한다.

모수 모델(parametric mode)

새로운 데이터 포인트를 분류할 수 있는 함수를 학습하기 위해 훈련 데이터셋에서 모델 파라미터를 추정한다.

훈련이 끝나면 원본 훈련 데이터셋은 더 이상 필요하지 않음.

퍼셉트론, 로지스틱 회귀, 선형SVM 등

Parametric Model: yi=β0+β1xi+eiyi=β0+β1xi+ei

비모수 모델(nonparametric model)

비모수 모델은 고정된 개수의 파라미터로 설명될 수 없다.

훈련 데이터가 늘어나면, 파라미터 개수도 늘어난다.

결정 트리/랜덤 포레스트, 커널SVM 등

Non-Parametric Model: yi=f(xi)+eiyi=f(xi)+ei

In a parametric model, you know which model exactly you will fit to the data, e.g., linear regression line. In a non-parametric model, however, the data tells you what the 'regression' should look like.

Let me give some concrete examples.

Parametric Model: yi=β0+β1xi+eiyi=β0+β1xi+ei

Here you know what the regression will look like: a linear line.

Non-Parametric Model: yi=f(xi)+eiyi=f(xi)+ei

where f(.) can be any function. The data will decide what the function f looks like. Data will not tell you the analytic expression for f(.), but it will give you its graph given your data set.

The reason why people say that there is inherently no difference between parametric and non-parametric regression is that the function f(.) can be perfectly approximated by an infinite-parameter model, which is parametric.

parametric model & nonparametric model